Libraries


Let’s load the libraries we need.

# Libraries
library(tidyverse)   # includes ggplot2
library(hrbrthemes)  # better chart appearance
library(viridis)     # better color palette
library(plotly)      # interactive charts
library(partykit)    # includes analysis for conditional inference trees

Data wrangling


Load the data we are going to look at: in this case, all back rows tested in the last 5 years who completed CMJ, reactive strength and speed testing.

The first six rows of the data are shown below:

back_row_data <- read.csv("data/data.csv")

head(back_row_data)
##            name body_weight jump_height peak_power rel_power  RSI time_10
## 1     Aled Ward       105.3        18.8       5193     52.08 2.49    1.69
## 2     Alex Mann        95.2        24.4       5930     62.13 3.11    1.60
## 3 Alun Lawrence       105.7        20.1       5786     54.72 1.72    1.71
## 4   Andy Powell       117.0        19.0       5632     48.13 1.62    1.77
## 5    Ben Davies        88.9        20.9       5044     64.04 2.83    1.76
## 6       Ben Fry       115.1        25.2       5575     54.29 2.97    1.75
##   time_20 time_30 time_40 max_v X10_momentum regional
## 1    3.04    4.27    5.39  8.13     619.4118        0
## 2    2.79    3.93    5.10  8.77     586.2500        0
## 3    2.94    4.13    5.34  8.40     618.1287        1
## 4    3.04    4.21    5.41  8.55     661.0169        1
## 5    3.06    4.26    5.48  8.55     486.8132        0
## 6    3.12    4.25    5.47  8.85     586.3636        1

Let’s build a chart


It’s useful to see the data visually to get a feel for what it contains. In this case, it might be interesting to look at whether relative peak power output, measured by CMJ, is correlated with 10m sprint time.

We can make this interactive, so the data can be further explored.

# Interactive scatter plot - relative peak power vs 10m sprint time
p <- back_row_data %>%
  mutate(text = paste("Name: ", name, "\nBody Weight (KG): ", body_weight, "\nPeak Power (W): ", peak_power, sep = "")) %>%
  ggplot( aes(x = rel_power, y = time_10, text = text)) +
  geom_point(alpha = 0.7) +
  theme_ipsum() +
  theme(legend.position = "none")
ggplotly(p, tooltip = "text")
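
To put a number on the relationship shown in the scatter plot, a correlation test is one option. The sketch below is a minimal example, assuming a Pearson correlation is appropriate for these two variables. So that it runs on its own, it uses only the six rows printed by head() above, stored in a hypothetical `sample_rows` data frame; in practice the same call would be made on the full `back_row_data`.

```r
# Only the six rows printed by head() above (an illustrative subset, not the full data set)
sample_rows <- data.frame(
  rel_power = c(52.08, 62.13, 54.72, 48.13, 64.04, 54.29),
  time_10   = c(1.69, 1.60, 1.71, 1.77, 1.76, 1.75)
)

# Pearson correlation between relative peak power and 10m sprint time,
# with a confidence interval and p-value
power_sprint_cor <- cor.test(sample_rows$rel_power, sample_rows$time_10,
                             method = "pearson")

power_sprint_cor$estimate  # correlation coefficient
power_sprint_cor$conf.int  # 95% confidence interval
```

With only six athletes the interval will be very wide, so any conclusion should wait for the full data set.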

Further exploration

Relative power output also nicely predicts which 10m sprint speed band an athlete will fall into:

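## conditional inference tree on 10m sprint speed band using relative power

# Band max velocity into three equal-width bins (printed for reference only;
# the tree below uses the 10m sprint bands instead)
max_v <- back_row_data$max_v

cut(max_v, 3, include.lowest = TRUE, labels = c("Slow", "Average", "Fast"))

# Band 10m sprint time into three equal-width bins; a lower time is faster
back_row_data <- back_row_data %>%
  mutate(
    speed = cut(time_10, 3, include.lowest = TRUE, labels = c("Fast", "Medium", "Slow"))
  )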

ctree_formula <- speed ~ rel_power

speed_ctree <- ctree(ctree_formula, data = back_row_data)

plot(speed_ctree)
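
The speed bands used for the tree come from cut() with a single number of breaks, which splits the observed range of 10m times into equal-width intervals rather than equal-sized groups. The sketch below illustrates this on the six 10m times printed by head() earlier (stored in a hypothetical `sample_times` vector) and counts how many athletes land in each band. The quantile-based variant is an alternative shown for comparison only, not the method used in this analysis.

```r
# The six 10m sprint times printed by head() earlier (illustrative subset only)
sample_times <- c(1.69, 1.60, 1.71, 1.77, 1.76, 1.75)

# Equal-width bands: the range of times is split into three intervals of equal width
equal_width <- cut(sample_times, 3, include.lowest = TRUE,
                   labels = c("Fast", "Medium", "Slow"))
table(equal_width)  # band sizes can be uneven

# Alternative (an assumption, not the original method): tertile bands,
# which aim for roughly the same number of athletes in each band
tertile_breaks <- quantile(sample_times, probs = c(0, 1/3, 2/3, 1))
tertiles <- cut(sample_times, breaks = tertile_breaks, include.lowest = TRUE,
                labels = c("Fast", "Medium", "Slow"))
table(tertiles)
```

Checking the band counts with table() before fitting the tree is worthwhile, because a very small band gives the tree little to learn from.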